PRESENTATION OF SEARCH RESULTS
USING DYNAMIC CATEGORIZATION 

FIELD OF THE INVENTION 

The present invention relates to information retrieval, and
more specifically, to an approach for presenting search 
results using dynamic categorization. 

BACKGROUND OF THE INVENTION 

Information systems provide for the storage, retrieval and 
sometimes management of data. Information is typically 
retrieved from an information system by submitting a query 
to the information system, where the query specifies a set of 
retrieval criteria. The information system processes the
query against a database and provides data that satisfies the 
search criteria (search results) to a user. 

The form of search results depends upon the context in 
which a particular search is performed. For example, in the 
context of a database search, search results might consist of
a set of rows from a table. In the context of the global 
information network known as the "Internet", the search 
results might consist of links to web pages. 

For the purpose of explanation, the specific data items 
against which a search query is executed are referred to
herein as searchable data items. The set of all searchable data 
items against which a query is executed is referred to herein 
as the searchable data set. The specific searchable data items 
that satisfy a particular query are referred to herein as 
matching data items. The set of all matching data items for
a given query is referred to herein as the search results of
the query. 

Processing a query containing general or generic search 
terms against a large searchable data set can result in a large
number of unorganized matching data items, sometimes 
referred to as "hits." For example, processing a query 
containing general or generic terms on the Internet can 
generate millions of hits. 

On the Internet, search queries are processed by search
tools known as "search engines" that typically present a 
sequential list of matching data items ranked by relevance, 
from most relevant to least relevant. As a result, the match- 
ing data items that best satisfy the search criteria are 
presented at the top of the list, with the other matching data
items presented further down the list in order of decreasing 
relevance. For example, web pages or web sites with web 
pages that contain the greatest number of the search terms 
receive the highest relevance ranking and are presented at 
the top of the list.

Because the search results are presented serially, with 
approximately ten to twenty hits per page, reviewing a large 
number of hits, for example several thousand, or even only 
several hundred hits, is often impractical. This is not
necessarily a problem in situations where the relevancy ranking
drops off quickly after a relatively small number of hits
because a user will typically only view the most relevant 
matching data items. However, in situations where a large 
number of hits have a high relevancy ranking, it can be 
impractical to review all of the most relevant hits.

One alternative approach for presenting search results is 
the static category approach. The static category approach 
involves pre-assigning all searchable data items to pre- 
defined or "static" subject matter categories based upon their 
content. When a search is performed, a relatively small
number of categories that satisfy the search criteria are
displayed instead of, or in addition to, the actual matching
data items. The members of those static categories (which
may or may not satisfy the search criteria) can then be 
accessed through the categories. 

In the context of the Internet, for example, all web pages 
and web sites containing subject matter relating to the topic 
of baseball would be statically assigned to a baseball cat- 
egory. When a query containing the term "baseball" is 
processed, the baseball category is displayed, instead of or 
in addition to, all of the individual web pages that satisfy the 
query terms. A user can then select the baseball category to 
view the web pages and web sites assigned to the baseball 
category. Categories containing a large number of search- 
able data items can be divided into sub-categories to create 
a statically-defined category hierarchy. 

Although the static category approach is helpful in allow- 
ing a user to navigate through a large number of searchable 
data items in an organized manner, it suffers from several 
drawbacks. First, if the amount of information being 
searched is large, a large amount of resources can be 
required to pre-assign all of the searchable data items to
categories. Furthermore, when the searchable data set 
changes, the category assignments must be updated to reflect 
the changes. For example, if new searchable data items are 
added to the searchable data set and the categories are not 
updated to reflect the new searchable data items, then a user 
cannot access the new searchable data items through the 
categories. As a result, the new searchable data items that 
cannot be accessed through the categories are effectively 
lost. 

Another drawback to the static category approach is that 
the statically-defined categories may not be helpful in find- 
ing information that does not fit squarely into the predefined 
categories. Thus, a search may result in the display of ten
categories, where each of the ten categories has a relatively 
low degree of relevance. 

These problems are particularly acute on the Internet for 
at least two reasons. First, the Internet provides access to a 
vast amount of information which requires an enormous 
amount of resources to assign searchable data items to 
categories. Secondly, the information available through the 
Internet is constantly changing and new information is being 
added at an astounding rate. Consequently, a large amount of 
resources is required to maintain static categories that do not 
necessarily reflect all of the searchable data set. Therefore,
based upon the need to present a large number of matching 
data items in an organized manner and the limitations of 
prior approaches, an approach for presenting a large number 
of matching data items in an organized manner that does not 
suffer from the limitations of prior approaches is highly 
desirable. 

SUMMARY OF THE INVENTION 

According to one aspect of the invention, a method is 
provided for presenting search results using dynamic cat- 
egorization. The method comprises the steps of receiving 
search results, dynamically establishing one or more search 
result categories based upon attributes of the search results 
and presenting one or more category identifiers correspond- 
ing to the one or more search result categories. 
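The three steps named above (receiving search results, establishing
categories from their attributes, and presenting category
identifiers) can be sketched as follows. This is only an
illustrative sketch, not the claimed method: the dictionary
representation of a matching data item, the "size" attribute name,
the "other" catch-all category and both function names are
assumptions made for the example.

```python
from collections import defaultdict


def establish_categories(search_results, attribute):
    """Group matching data items by the value of one attribute."""
    categories = defaultdict(list)
    for item in search_results:
        # Items lacking the attribute go to a catch-all category
        # (an assumption) so that every item is assigned somewhere.
        categories[item.get(attribute, "other")].append(item)
    return dict(categories)


def present_category_identifiers(categories):
    """Return one identifier per category, largest category first."""
    ordered = sorted(categories.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [f"{name} ({len(items)})" for name, items in ordered]


# Hypothetical search results received for a query about automobiles.
results = [
    {"title": "Tango", "size": "compact"},
    {"title": "Foxtrot", "size": "compact"},
    {"title": "Zebra", "size": "full size"},
    {"title": "Spark", "size": "sports"},
]
categories = establish_categories(results, "size")
identifiers = present_category_identifiers(categories)
```

Because the categories are derived from the results themselves,
nothing has to be assigned to a category before the search is run.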

According to another aspect of the invention, a method is
provided for presenting search results on a user interface
using dynamic categorization. The method comprises the
steps of dynamically establishing one or more search result
categories based upon attributes of the search results and
displaying on the user interface one or more interface
objects corresponding to the one or more search result
categories.

According to another aspect of the invention, a computer
system is provided for presenting search results to a user
using dynamic categorization. The computer system comprises
a user interface, one or more processors and a memory
coupled to the one or more processors. The memory contains
one or more sequences of one or more instructions which,
when executed by the one or more processors, cause the
computer system to perform the steps of receiving search
results, dynamically establishing one or more search result
categories based upon attributes of the search results and
displaying on the user interface one or more category
indicators corresponding to the one or more search result
categories.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the invention are illustrated by way of
example, and not by way of limitation, in the figures of the
accompanying drawings and in which like reference numerals
refer to similar elements and in which:

FIG. 1 is a high-level flow chart illustrating an approach
for presenting search results using dynamic categorization
according to an embodiment of the invention;

FIG. 2 is a detailed flow chart illustrating an approach for
presenting search results using dynamic categorization
according to another embodiment of the invention;

FIG. 3A is a block diagram illustrating a user interface for
presenting search results using dynamic categorization
according to an embodiment of the invention;

FIG. 3B is a block diagram illustrating a user interface for
presenting search results using dynamic categorization and
sub-categories according to an embodiment of the invention;

FIG. 3C is a block diagram illustrating a user interface for
presenting search results using dynamic categorization and
user-selectable categories according to an embodiment of
the invention; and

FIG. 4 is a block diagram of a computer system on which
embodiments of the invention may be implemented.

DETAILED DESCRIPTION OF THE
INVENTION

In the following description, for the purposes of
explanation, specific details are set forth in order to provide
a thorough understanding of the invention. However, it will
be apparent that the invention may be practiced without
these specific details. In other instances, well-known
structures and devices are depicted in block diagram form in
order to avoid unnecessarily obscuring the invention.

FUNCTIONAL OVERVIEW

In general, search results are presented using dynamic
categorization. Dynamic categorization involves examining
search results and dynamically establishing one or more
search result categories based upon attributes of the search
results. As described in more detail hereinafter, a variety of
grouping or clustering techniques may be used to dynamically
establish the search result categories. The search result
categories are then presented using category indicators, as
described in more detail hereinafter.

Dynamic categorization allows search result categories to
be generated on a search-by-search basis while ensuring that
all matching data items are assigned to at least one search
result category. As a result, categories may be tailored to
each set of search results and based on user or application
preferences. Dynamic categorization may be used in
combination with static categories to provide a hybrid category
hierarchy. Finally, dynamic categorization may be used to
modify search queries, as described in more detail
hereinafter.

FIG. 1 is a flow chart 100 illustrating an approach for
presenting search results using dynamic categorization
according to an embodiment of the invention. After starting
in step 102, in step 104 search results are received. In step
106, the search results are examined and one or more search
result categories are dynamically established based upon
attributes of the matching data items that satisfy the query.
In step 108, the search results are presented to a user based
upon the one or more search result categories, as described
in more detail hereinafter. Finally, the process is complete in
step 110.

1. DYNAMICALLY DETERMINING CATEGORIES

Dynamically determining categories involves identifying
similarities and/or dissimilarities of attributes in the
matching data items and establishing a set of candidate categories
based upon the identified similarities and/or dissimilarities.
The nature of the attributes used to determine similarities
and dissimilarities may vary based on the nature of the
matching data items. For example, if the matching data
items are structured records, the attributes used to determine
the categories may be selected fields of the structured
records. On the other hand, if the matching data items are
relatively unstructured text-based electronic documents,
then the attribute values used to determine categories may
simply be similarity coefficients that have been generated
based on comparisons between the text contents of the
documents.

The candidate categories may be filtered or otherwise
processed to select an appropriate number of final categories
from the candidate categories. In situations where the
number of candidate categories is sufficiently small, the filtering
may not be necessary. Ideally, the number of final categories
is selected so that when the final categories are presented to
a user, the user can review the final categories in a relatively
short period of time. Accordingly, the actual number of final
categories necessarily depends upon both the requirements
of a particular application and the way in which the final
categories are presented to the user.

Once the final categories are determined, the matching
data items are assigned to the final categories and the final
categories are presented to the user. The steps of determining
candidate categories, determining final categories based
upon the candidate categories and assigning the matching
data items to the final categories are collectively referred to
as "clustering." The particular clustering technique used
depends upon the particular requirements of an application
and the invention is not limited to any particular clustering
technique. Examples of clustering techniques include
Bayesian clustering, neural networks, Jaccard similarity
coefficients, semantic analysis and various natural language
processing algorithms. The particular clustering algorithm
used may be user-defined.

The approach of presenting search results using dynamic
categorization is now described with reference to the flow
chart 200 of FIG. 2. After starting in step 202, in step 204
search results are received. The particular way in which a
search is performed is not germane to embodiments of the
invention and embodiments of the invention are not limited
to any particular type of search.

In step 206, a determination is made as to whether initial
criteria are satisfied. According to one embodiment of the
invention, the initial criteria include a minimum number of
search results. If the number of matching data items is
below a minimum threshold, then dynamic categorization is
not used and traditional presentation approaches are used
instead. Another example of the initial criteria is whether the
search results consist of data from more than one data source
(e.g. different databases, such as a real time query and a
static database query), where dynamic categorization is used
to combine the data from the different sources to be
presented to the user. If the initial criteria are not satisfied, then
the process is complete in step 224.

If, however, in step 206, a determination is made that the
initial criteria are satisfied, then in step 208 the matching
data items (search results) are filtered to generate filtered
search results. According to one embodiment of the
invention, the matching data items are filtered by a relevance
threshold. Traditional search techniques provide a relevancy
rating for search results that indicates how well individual
matching data items satisfy the search criteria. In situations
where a query results in a large number of matching data
items, it is often useful to reduce the number of matching
data items by discarding matching data items that do not
satisfy a minimum relevance threshold.

For example, for particular search results containing a
large amount of data, all matching data items having a
relevancy of less than fifty percent might be discarded.
According to another embodiment of the invention, a
particular number of the most relevant hits are retained, with the
remaining hits being discarded. For example, suppose a
determination is made that at most one hundred hits are
desired. A particular search is performed and the search
results include twenty thousand hits. In this situation the
relevancy ratings for the matching data items are used to
identify and keep the one hundred most relevant hits and
discard the remaining nineteen thousand, nine hundred hits.

For the purpose of explanation, the matching data items
that are not discarded during the filtering process are
referred to herein as qualifying data items. Thus, in the
example given above, the query resulted in twenty thousand
matching data items, but only one hundred qualifying data
items.

In step 210, the qualifying data items are optionally sorted
by one or more attributes to generate sorted search results.
For example, in the context of search results that include
addresses, the search results might be sorted by zip code.

In step 212, common attribute values among the qualifying
data items are identified. The common attribute values
are specific to each set of search results. For example, for
search results pertaining to automobiles, common attribute
values may include compact cars, mid-size cars, full size
cars, and sports cars.

In step 214, similarity data is determined for the search
results that indicates the occurrence of the common attribute
values among the qualifying data items. For example, the
similarity data would indicate how many of the hits in the
filtered search results have the attribute values of compact
cars, mid-size cars, full size cars, and sports cars,
respectively. In step 216, the search results are grouped based upon
the similarity data. For example, the qualifying data items
having the compact car attribute value are grouped together
and the hits in the search results having the mid-size car
attribute value are grouped together.

In step 218, one or more categories are selected based
upon the groupings. According to one embodiment of the
invention, the one or more categories are selected by a
majority vote. Specifically, the categories having the most
qualifying data items are selected. Categories having
relatively few qualifying data items are collapsed
into broader categories, so as to reduce the total number of
selected categories.

In step 220, the qualifying data items are assigned to the
categories. For example, the hits having the compact car
attribute are assigned to the compact car category. For hits
having attributes of categories that were collapsed into
broader categories, those hits are assigned to the broader
category. For example, if the mid-size car and full size car
categories are collapsed into a single full size car category,
then all of the hits having the mid-size car attribute are
included in the full size car category. In step 222, the
categories and qualifying data items are presented to the
user, as described in more detail hereinafter. The process is
complete in step 224.

In steps 214 and 216, more than one algorithm may be
used to produce a number of groupings. According to one
embodiment of the invention, an optimal grouping may be
selected as the grouping presented to the user. An optimal
grouping is typically determined based upon the
requirements of a particular application. For example, grouping by
one attribute may produce more categories than grouping by
another attribute. Conversely, some groupings may cluster
results with similar relevance scores (which may be
independent of the categorization criteria). This may be more
preferable in some circumstances than groupings with a
smaller number of categories.

An application can also have access to the different
groupings formed during steps 214 and 216, so that the
application or the user may elect to view a different grouping
other than the one initially selected for presentation. This
ability to take different views of what is basically the same
large collection of data is akin to doctors using X-ray, MRI,
and CAT scan to look at the same tumor in different ways in
order to understand it better.

2. PRESENTING SEARCH RESULTS

FIG. 3A illustrates a user interface 300 for presenting
search results using dynamic categorization according to an
embodiment of the invention. User interface 300 may be
implemented in any combination of discrete hardware
circuitry and computer software. Typically, user interface 300
is provided as a graphical representation on a computer
screen that is generated by the execution of sequences of
instructions by one or more processors.

Categories that are dynamically determined in accordance
with embodiments of the invention are presented using
category indicators. A category indicator is any object that is
capable of representing a category. Since the invention is not
limited to any particular medium for presenting search
results, the type of category indicator may vary depending
upon the requirements of a particular application. For
example, for presenting search results on a user interface, a
user interface object may be used as a category indicator.
The user interface object may provide some indicia that it
corresponds to a particular category of search results,
dynamically determined in accordance with embodiments of
the invention. For presenting search results in a data file or
on a printer, a category indicator may include a text string
identifying the corresponding category.

Referring to the prior example of search results pertaining
to automobiles, user interface 300 includes three category
indicators 302, 304 and 306 that correspond to the
dynamically-determined categories previously described.
Category indicator 302 corresponds to the category
"Automobiles: Compact Cars" and includes two qualifying data
items from the search results, designated by the reference
numeral 308. Qualifying data items 308 include compact
cars "Tango" and "Foxtrot". Category indicator 304
corresponds to the category "Automobiles: Full Size Cars" that
includes qualifying data items 310. Qualifying data items
310 include full size cars, "Zebra," "Elephant" and "Rhino."
Category indicator 306 corresponds to the category
"Automobiles: Sports Cars" that includes a qualifying data item
312. Qualifying data item 312 is a sports car "Spark."

For purposes of illustration, in FIG. 3A the qualifying data
items 308, 310, 312 and 314 are displayed with their
respective category indicators 302, 304 or 306. However,
according to another embodiment of the invention,
qualifying data items 308, 310, 312 and 314 are not initially
displayed. Rather, only category indicators 302, 304 and 306
are initially displayed to reduce the amount of information
displayed. The qualifying data items
308, 310, 312 and 314 are displayed in response to a user
selection of category indicators 302, 304 and 306. For
example, in response to a user selection of category indicator
302, qualifying data items 308 are displayed. In response to
another user selection (de-selection) of category indicator
302, qualifying data items 308 are undisplayed from user
interface 300. This is particularly helpful when category
indicator 302 contains a sufficiently large number of
qualifying data items 308 such that other category indicators 304
and 306 cannot be displayed simultaneously with the
members of the category associated with category indicator 302.

User interface 300 also includes an indicator 314
identified as "<more in this category>." In response to the
selection of indicator 314 by a user, additional hits in the
category corresponding to category indicator 304 are
displayed on user interface 300. Indicator 314 provides the
benefit of informing a user that additional hits for the
category corresponding to category indicator 304 are
available, without over-cluttering user interface 300.

For example, if qualifying data items 308, 310 and 312 are
structured records, the text titles may be derived from fields
in the structured records. In the present example, both of the
qualifying data items 308, namely "Tango" and "Foxtrot"
may have a "compact car" field. In circumstances where
qualifying data items 308, 310 and 312 are relatively
unstructured text-based electronic documents, then category
indicators 302, 304 and 306 may not be displayed at all.
Instead, the first qualifying data item in qualifying data items
308, 310 and 312, namely "Tango," "Zebra," and "Spark"
would be displayed on user interface 300 followed by a
user-selectable "<more like this>" indicator. This approach
displays a representative qualifying data item in qualifying
data items 308, 310 and 312 while allowing a user to easily
view the remaining qualifying data items by selecting the
"<more like this>" indicator. The text titles provided with
category indicators 302, 304 or 306 are derived from
attributes of their respective qualifying data items 308, 310
and 312.

Categories within a group may be presented to users in
any order. However, some orderings may be preferable to
others. For example, a group by unit price range may be
more suitably displayed initially sorted by price range. A
common way of presenting groups during "fuzzy" searches
(where matches aren't exact) is by relevance. A category
relevance rating can be calculated for each category, and the
categories can then be presented in relevance sorted order.

Category relevance can be calculated in any number of
ways depending on the requirements of a particular
application. One way is to assign the highest relevance score of
any item in the category as the category's score. This has the
effect of elevating groups containing at least one high
scoring item to the top. Another way is to assign the average
score of all items in the category as the category's score. Yet
another way is to use the median, or a weighted average. In
the case where there isn't a clear ordering even after
assigning the scores to the categories (e.g. scores are very
similar), another ordering (such as alphabetical) may be used
as a tie breaker. Again, the user and the application may have
complete control on which algorithm is used, and can select
different algorithms.

3. SUB-CATEGORIES

Dynamic categorization may also be used to generate
sub-categories. Generating sub-categories is particularly
useful when a category has a large number of hits. For
example, referring to FIG. 3B, in the situation where the
category corresponding to category indicator 304 contains a
large number of hits, sub-categories are generated and
sub-category indicators 316 and 318 corresponding to the
sub-categories are presented on user interface 300. The
sub-categories corresponding to sub-category indicators 316
and 318 are generated based upon attributes of qualifying
data items 310 contained in the category corresponding to
category indicator 304.

In the present example, qualifying data items 310 have a
price attribute which is used to generate the sub-categories
that correspond to sub-category indicators 316 and 318.
Specifically, the sub-category corresponding to sub-category
indicator 316 is generated for hits having a price attribute of
less than $25,000. In the present example, this sub-category
includes entries 320 "Zebra" and "Elephant." On the other
hand, the sub-category corresponding to sub-category
indicator 318 is generated for hits having a price attribute of
more than $25,000. This sub-category includes a hit 322
"Rhino." The sub-category corresponding to sub-category
indicator 318 also includes a hit 324 designated as "<more
in this category>" that provides access to additional hits in
sub-category 318.

According to one embodiment of the invention,
sub-category indicators 316 and 318 and hits 318, 320 and 322
are not initially displayed under category indicator 304. In
response to a user selection of category indicator 304,
sub-category indicators 316 and 318 are displayed, but not
hits 318, 320 and 322. Then, in response to a user selection
of sub-category indicators 316 and 318, hits 318, 320 and 322
are displayed, respectively. This is particularly helpful when
the category corresponding to category indicator 304
contains a large number of hits. Sub-category indicators 316 and
318 may also be de-selected and undisplayed as previously
described with respect to category indicators 302, 304 and
306.

4. USER-SELECTABLE CATEGORIES

According to another embodiment of the invention, a set
of one or more candidate categories are presented to a user
and the user is permitted to select one or more of the
candidate categories, and/or one or more sets of candidate
categories, to be used as the final categories to present the
search results. Once the user selects the final categories, the
qualifying data items are assigned to the final categories and
the final categories and search results are presented to the
user.

As illustrated in FIG. 3C, user interface 300 includes a set
of user-selectable category indicators 330 corresponding to
categories that have been determined using the dynamic
categorization approach described herein. A user may select
one or more of the user-selectable category indicators 330 to
be used in presenting the search results to the user. This
provides a user with the flexibility to choose specific
categories to be used to categorize the search results. User
interface 300 also includes a set of hit counts 332 that
indicate how many hits are assigned to each of the
user-selectable categories 330. The hit counts 332 provide
information that may help the user determine which of the
user-selectable categories he or she might want to choose.
According to one embodiment, the user may select one or
more sets of categories, where the categories within one set
are established based on different attributes than the
categories within the other sets. For example, one set of categories
may group cars according to their size, while another set of
categories groups cars according to their price range, while
yet another set of categories groups cars according to their
manufacturer. The user may then select specific categories
from one or more of the category sets on a category by
category basis, or on an entire category-set by category-set
basis.

Significantly, when some final categories are generated
based on different attributes than other final categories, then
it is possible for the same qualifying data item to be assigned
to more than one of the final categories. For example, if a
user selects a particular car size category as a final category,
a particular price range category as a final category, and a
particular manufacturer category as a final category, it is
possible for a qualifying data item that contains information
about a particular car to fall into all three of the selected
categories.
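
The overlap described above can be illustrated with a short
sketch. It assumes items are dictionaries and that each selected
final category is an (attribute, value) pair; the attribute names,
the price range labels and the manufacturer names "Acme" and
"Bolt" are hypothetical and appear nowhere in the figures.

```python
def assign_across_sets(items, selected):
    """Assign items to final categories drawn from different attributes.

    selected maps an attribute name to the category value chosen
    by the user for that attribute.
    """
    final = {}
    for attribute, value in selected.items():
        label = f"{attribute}: {value}"
        final[label] = [i["title"] for i in items if i.get(attribute) == value]
    return final


cars = [
    {"title": "Zebra", "size": "full size",
     "price_range": "under $25,000", "manufacturer": "Acme"},
    {"title": "Rhino", "size": "full size",
     "price_range": "over $25,000", "manufacturer": "Acme"},
    {"title": "Spark", "size": "sports",
     "price_range": "over $25,000", "manufacturer": "Bolt"},
]
selected = {
    "size": "full size",
    "price_range": "over $25,000",
    "manufacturer": "Acme",
}
final = assign_across_sets(cars, selected)
# "Rhino" satisfies all three selections, so it appears three times.
```

Each final category is filled independently, which is why one
qualifying data item can be listed under several of them.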

5. USING DYNAMIC CATEGORIZATION WITH 
STATIC CATEGORIES 

Dynamic categorization may also be used with static
categories. Using dynamic categorization with static
categories is particularly helpful when a static category includes a
large number of hits. Under these circumstances, dynamic
categorization may be used to determine one or more
sub-categories to organize the hits contained in the particular
category. Dynamic categorization is also particularly helpful
when certain hits are not assigned to any static categories.
These hits are often referred to as "orphan hits." Additional
categories may be generated for the orphan hits using the
dynamic categorization approach described herein.

For example, referring to FIG. 3B, suppose that category
indicator 304 corresponds to a static category that contains a
large number of hits. Under these circumstances, dynamic
categorization is useful to dynamically determine
sub-categories, as previously described, to provide additional
organization to the hits contained in the static category
corresponding to static category indicator 304. If the
sub-categories contain too many hits, then additional
sub-categories may be generated. The additional sub-categories
may be added to the static category associated with category
indicator 304 or to the sub-categories associated with
sub-category indicators 316 and 318.
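
One way to picture this combination is the following Python sketch. It is an illustration under stated assumptions, not the described implementation: the threshold `MAX_HITS_PER_CATEGORY`, the hit fields, and the use of simple grouping by one attribute as a stand-in for a clustering technique are all hypothetical.

```python
from collections import defaultdict

MAX_HITS_PER_CATEGORY = 3  # assumed threshold for "a large number of hits"

def group_by(hits, attribute):
    """Group hits by one attribute value (a simple stand-in for clustering)."""
    groups = defaultdict(list)
    for hit in hits:
        groups[hit.get(attribute, "unknown")].append(hit["title"])
    return dict(groups)

def organize(hits, static_categories, attribute):
    """static_categories maps a category name to a membership predicate."""
    organized, assigned = {}, set()
    for name, belongs in static_categories.items():
        members = [(i, h) for i, h in enumerate(hits) if belongs(h)]
        assigned.update(i for i, _ in members)
        member_hits = [h for _, h in members]
        if len(member_hits) > MAX_HITS_PER_CATEGORY:
            # Large static category: add dynamically determined sub-categories.
            organized[name] = group_by(member_hits, attribute)
        else:
            organized[name] = [h["title"] for h in member_hits]
    # Orphan hits belong to no static category and get dynamic categories.
    orphans = [h for i, h in enumerate(hits) if i not in assigned]
    if orphans:
        organized["orphan hits"] = group_by(orphans, attribute)
    return organized

hits = [
    {"title": "Roadster review", "topic": "cars", "kind": "sports"},
    {"title": "Coupe test drive", "topic": "cars", "kind": "sports"},
    {"title": "City hatchback", "topic": "cars", "kind": "compact"},
    {"title": "Small sedan guide", "topic": "cars", "kind": "compact"},
    {"title": "Sloop buyer's guide", "topic": "boats", "kind": "sail"},
]
static_categories = {"cars": lambda h: h["topic"] == "cars"}
organized = organize(hits, static_categories, "kind")
```

Here the static "cars" category exceeds the assumed threshold and is split into "sports" and "compact" sub-categories, while the boat hit, which matches no static category, is placed in a dynamically generated category.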

6. MODIFYING SEARCH CRITERIA USING DYNAMIC 
CATEGORIZATION 

Dynamic categorization may also be used to modify
search criteria to be used in subsequent searches. A search
query may be modified (broadened or narrowed) based upon
dynamic categories determined by dynamic categorization.
Specifically, query terms that correspond to dynamic
categories may be added to a search query, replace existing
query terms or be used instead of existing query terms. For
example, suppose in the prior example the original query
was "automobile". The original query may be modified to
add the term "sports cars" to form a new query "automobile
AND sports cars" when the user selects the category
identifier for the dynamically determined "sports car" category.
As another example, the original query may be modified to
just "sports cars". Care must be taken not to overly narrow
a search query by adding in too many terms associated with
dynamic categories. For example, the search query
"automobiles AND compact cars AND full size cars AND sports
cars" may not yield any search results. Each category may
optionally have keywords associated with it which can be
used in narrowing the search (used as AND or OR terms).
The keywords can be statically defined in a dictionary, or
may be dynamically generated by looking for the most
common words in items in each category. It may be
advantageous to use AND terms more sparingly than OR terms
since they may overly limit the search.
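
The query modifications described above can be sketched in Python as follows. The function names, the stop-word list, and the sample item texts are assumptions made for illustration; the query syntax simply mirrors the "automobile AND sports cars" example.

```python
import re
from collections import Counter

STOP_WORDS = {"a", "an", "the", "with", "and", "or"}  # assumed stop-word list

def narrow(query, term):
    """Add a category term as an AND term (narrows the search)."""
    return f"{query} AND {term}"

def broaden(query, term):
    """Add a category term as an OR term (broadens the search)."""
    return f"{query} OR {term}"

def category_keywords(item_texts, top_n=2):
    """Dynamically derive keywords as the most common words in a category's items."""
    counts = Counter(
        word
        for text in item_texts
        for word in re.findall(r"[a-z]+", text.lower())
        if word not in STOP_WORDS
    )
    return [word for word, _ in counts.most_common(top_n)]

original_query = "automobile"
narrowed = narrow(original_query, "sports cars")
broadened = broaden(original_query, "sports cars")

# Hypothetical text of the items assigned to one dynamic category.
sports_items = ["fast sports coupe", "sports coupe with turbo", "sports convertible"]
keywords = category_keywords(sports_items)
```

With these inputs, `narrowed` is "automobile AND sports cars" and the two most common words in the sample category are "sports" and "coupe". Chaining several `narrow` calls reproduces the over-narrowing risk noted in the text, which is one reason a caller might prefer `broaden` for additional category terms.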

The invention is not limited in its application to any 
particular type of search results. Rather, dynamic categori- 
zations may be used with any type of search results. Further, 
although dynamic categorization has been described herein 
primarily in the context of categorizing search results from 
a new search, dynamic categorization may also be used with 
portions of search results. For example, dynamic categori- 
zation may be applied to a locally cached portion of search 
results and optionally extended to the remaining portions of 
the search results, i.e. the portions of the search results that 
are remotely stored. In addition, the approach described 
herein may be applied to locally cached search results that 
are periodically updated by background search processes. 
Thus, the approach described herein may be applied to any 
portion of search results. 

Embodiments of the invention are also applicable to
real-time search applications in which additional matching
data items are received after a query has been processed,
matching data items have been received, and categories have
been dynamically determined as described herein. In this
circumstance, the additional matching data items are exam- 
ined and added to the existing categories if possible. For 
example, additional matching data items that have attributes 
that are sufficiently similar to attributes of the existing 
categories can be added to those categories. The additional 
matching data items that cannot be assigned to existing 
categories may be retained as part of the search results and 
included in the next dynamic categorization. As a result, 
when a user elects to re-categorize, then all of the additional 
matching data items may be assigned to categories. 
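
The handling of late-arriving items can be sketched in Python as follows. This is a sketch under assumptions: the Jaccard coefficient is used here as one possible similarity measure, and the `threshold` value, the attribute sets, and the data layout are hypothetical rather than taken from the description.

```python
def jaccard(a, b):
    """Jaccard coefficient of two attribute collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def assign_incremental(categories, new_items, threshold=0.5):
    """Add each new item to its most similar existing category if the
    similarity reaches the (assumed) threshold; return the items that could
    not be assigned so they can be included in the next categorization."""
    pending = []
    for item in new_items:
        best_name, best_score = None, 0.0
        for name, category in categories.items():
            score = jaccard(item["attributes"], category["attributes"])
            if score > best_score:
                best_name, best_score = name, score
        if best_name is not None and best_score >= threshold:
            categories[best_name]["items"].append(item["id"])
        else:
            pending.append(item)
    return pending

# Existing dynamically determined categories (hypothetical attributes).
categories = {
    "sports cars": {"attributes": {"car", "sports", "coupe"}, "items": [1]},
    "compact cars": {"attributes": {"car", "compact", "hatchback"}, "items": [2]},
}
# Additional matching data items received after categorization.
new_items = [
    {"id": 10, "attributes": {"car", "sports", "coupe", "turbo"}},
    {"id": 11, "attributes": {"boat", "sail"}},
]
pending = assign_incremental(categories, new_items)
```

Item 10 shares three of four attributes with "sports cars" (coefficient 0.75) and is added to that category; item 11 matches nothing and is retained in `pending` until the user elects to re-categorize.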

7. IMPLEMENTATION MECHANISMS

The approach for presenting search results using dynamic 
categorization as described herein may be implemented in 
discrete hardware circuitry, in computer software, or a 
combination of discrete hardware circuitry and computer 
software. 

FIG. 4 is a block diagram that illustrates a computer 
system 400 upon which embodiments of the invention may 
be implemented. Computer system 400 includes a bus 402 
or other communication mechanism for communicating 
information, and a processor 404 coupled with bus 402 for 
processing information. Computer system 400 also includes 
a main memory 406, such as a random access memory 
(RAM) or other dynamic storage device, coupled to bus 402 
for storing information and instructions to be executed by 
processor 404. Main memory 406 also may be used for 
storing temporary variables or other intermediate informa- 
tion during execution of instructions to be executed by 
processor 404. Computer system 400 further includes a read 
only memory (ROM) 408 or other static storage device 
coupled to bus 402 for storing static information and instruc- 
tions for processor 404. A storage device 410, such as a 
magnetic disk or optical disk, is provided and coupled to bus 
402 for storing information and instructions. 

Computer system 400 may be coupled via bus 402 to a 
display 412, such as a cathode ray tube (CRT), for displaying 
information to a computer user. An input device 414, includ- 
ing alphanumeric and other keys, is coupled to bus 402 for 
communicating information and command selections to 
processor 404. Another type of user input device is cursor 
control 416, such as a mouse, a trackball, or cursor direction 
keys for communicating direction information and command
selections to processor 404 and for controlling cursor
movement on display 412. This input device typically has 
two degrees of freedom in two axes, a first axis (e.g., x) and 
a second axis (e.g., y), that allows the device to specify 
positions in a plane. 

The invention is related to the use of computer system 400 
for presenting search results using dynamic categorization. 
According to one embodiment of the invention, the presen- 
tation of search results using dynamic categorization is 
provided by computer system 400 in response to processor 
404 executing one or more sequences of one or more 
instructions contained in main memory 406. Such instruc- 
tions may be read into main memory 406 from another 
computer-readable medium, such as storage device 410. 
Execution of the sequences of instructions contained in main 
memory 406 causes processor 404 to perform the process 
steps described herein. One or more processors in a multi- 
processing arrangement may also be employed to execute 
the sequences of instructions contained in main memory 
406. In alternative embodiments, hard-wired circuitry may 
be used in place of or in combination with software instruc- 
tions to implement the invention. Thus, embodiments of the 
invention are not limited to any specific combination of 
hardware circuitry and software. 

The term "computer-readable medium" as used herein 
refers to any medium that participates in providing instruc- 
tions to processor 404 for execution. Such a medium may 
take many forms, including but not limited to, non-volatile 
media, volatile media, and transmission media. Non-volatile 
media includes, for example, optical or magnetic disks, such 
as storage device 410. Volatile media includes dynamic 
memory, such as main memory 406. Transmission media 
includes coaxial cables, copper wire and fiber optics, includ- 
ing the wires that comprise bus 402. Transmission media can 
also take the form of acoustic or light waves, such as those 
generated during radio wave and infrared data communica- 
tions. 

Common forms of computer-readable media include, for
example, a floppy disk, a flexible disk, hard disk, magnetic
tape, or any other magnetic medium, a CD-ROM, any other
optical medium, punch cards, paper tape, any other physical
medium with patterns of holes, a RAM, a PROM, an
EPROM, a FLASH-EPROM, any other memory chip or
cartridge, a carrier wave as described hereinafter, or any
other medium from which a computer can read.

Various forms of computer readable media may be 
involved in carrying one or more sequences of one or more 
instructions to processor 404 for execution. For example, the 
instructions may initially be carried on a magnetic disk of a 
remote computer. The remote computer can load the instruc- 
tions into its dynamic memory and send the instructions over 
a telephone line using a modem. A modem local to computer 
system 400 can receive the data on the telephone line and 
use an infrared transmitter to convert the data to an infrared 
signal. An infrared detector coupled to bus 402 can receive 
the data carried in the infrared signal and place the data on 
bus 402. Bus 402 carries the data to main memory 406, from 
which processor 404 retrieves and executes the instructions. 
The instructions received by main memory 406 may option- 
ally be stored on storage device 410 either before or after 
execution by processor 404. 

Computer system 400 also includes a communication
interface 418 coupled to bus 402. Communication interface
418 provides a two-way data communication coupling to a
network link 420 that is connected to a local network 422.
For example, communication interface 418 may be an
integrated services digital network (ISDN) card or a modem
to provide a data communication connection to a
corresponding type of telephone line. As another example,
communication interface 418 may be a local area network
(LAN) card to provide a data communication connection to
a compatible LAN. Wireless links may also be implemented.
In any such implementation, communication interface 418
sends and receives electrical, electromagnetic or optical
signals that carry digital data streams representing various
types of information.

Network link 420 typically provides data communication
through one or more networks to other data devices. For
example, network link 420 may provide a connection
through local network 422 to a host computer 424 or to data
equipment operated by an Internet Service Provider (ISP)
426. ISP 426 in turn provides data communication services
through the world wide packet data communication network
now commonly referred to as the "Internet" 428. Local
network 422 and Internet 428 both use electrical,
electromagnetic or optical signals that carry digital data streams.
The signals through the various networks and the signals on
network link 420 and through communication interface 418,
which carry the digital data to and from computer system
400, are exemplary forms of carrier waves transporting the
information.

Computer system 400 can send messages and receive
data, including program code, through the network(s),
network link 420 and communication interface 418. In the
Internet example, a server 430 might transmit a requested
code for an application program through Internet 428, ISP
426, local network 422 and communication interface 418. In
accordance with the invention, one such downloaded
application provides for presenting search results using dynamic
categorization as described herein.

The received code may be executed by processor 404 as
it is received, and/or stored in storage device 410, or other 
non-volatile storage for later execution. In this manner, 
computer system 400 may obtain application code in the 
form of a carrier wave. 

The approach for presenting search results using dynamic
categorization as described herein provides several
advantages over prior approaches for presenting search results.
First, a large number of search results can be presented to a
user in an organized manner without the loss of information.
This eliminates the need to reduce the amount of search
results by narrowing search criteria. In addition, since
dynamically-determined categories are based upon the
attributes of particular search results, the dynamically
determined categories are customized to each set of search
results. In particular, this allows unique sets of
sub-categories to be generated for each set of search results.
Furthermore, the approach for presenting search results
using dynamic categorization as described herein may be
implemented using any type of clustering technique. Finally,
dynamically-determined categories can be used to modify
search criteria to aid in subsequent searches.

In the foregoing specification, the invention has been
described with reference to specific embodiments thereof. It
will, however, be evident that various modifications and
changes may be made thereto without departing from the
broader spirit and scope of the invention. The specification
and drawings are, accordingly, to be regarded in an
illustrative rather than a restrictive sense.
What is claimed is: 

1. A method for presenting search results, the method
comprising the steps of:
receiving search results;
dynamically establishing one or more search result
categories based upon attributes of the search results by
identifying common attributes among the search
results,
generating a set of one or more coefficients that reflect
the similarity or dissimilarity of the search results
based upon the common attributes,
grouping the search results based upon the set of one or
more coefficients, and
selecting the one or more categories based upon the
grouping of the search results; and
presenting one or more category identifiers corresponding
to the one or more search result categories.

2. The method as recited in claim 1, wherein every
member of the one or more search result categories is a data
item that satisfies criteria specified in a query that produced
the search results.

3. The method as recited in claim 1, wherein the step of
identifying common attributes among the search results is
performed using Bayesian clustering techniques.

4. The method as recited in claim 1, wherein the step of
identifying common attributes among the search results is
performed using a neural network.

5. The method as recited in claim 1, wherein
the coefficients are Jaccard coefficients, and
the step of generating a set of one or more coefficients that
reflect the similarity of the search results based upon
the common attributes includes the step of generating a
set of one or more Jaccard coefficients that reflect the
similarity of the search results based upon the common
attributes.

6. The method as recited in claim 1, wherein
the search results are first search results,
the method further comprises the step of applying
relevance criteria to the first search results to generate
second search results that satisfy the relevance criteria,
and
the step of dynamically establishing one or more search
result categories based upon attributes of the search
results includes the step of dynamically establishing
one or more search result categories based upon
attributes of the second search results.

7. The method as recited in claim 1, wherein
the method further comprises the step of sorting the
search results by the attributes of the search results to
generate sorted search results, and
the step of dynamically establishing one or more search
result categories based upon attributes of the search
results includes the step of dynamically establishing
one or more search result categories based upon
attributes of the sorted search results.

8. The method as recited in claim 1, wherein the search
results include a plurality of matching data items and the
method further comprises the step of assigning the matching
data items to the one or more search result categories.

9. The method as recited in claim 1, further comprising
the step of in response to a user selection, presenting search
results associated with the one or more search result
categories.

10. The method as recited in claim 1, wherein the method
further comprises the steps of
dynamically establishing one or more search result
sub-categories based upon both the one of said search result
categories and the search results that belong to said one
of said search result categories, and
presenting one or more sub-category identifiers
corresponding to the one or more search result sub-categories.

11. The method as recited in claim 10, further comprising
the step of in response to a user selection, presenting search
results associated with the one or more sub-categories.

12. A method for presenting search results comprising the
steps of:
receiving search results;
dynamically establishing one or more search result
categories based upon attributes of the search results;
presenting one or more category identifiers corresponding
to the one or more search result categories; and
presenting one or more static category identifiers
corresponding to one or more static search result categories.

13. The method as recited in claim 12, further comprising
the steps of
presenting first search results corresponding to the one or
more search result categories, and
presenting second search results corresponding to the one
or more static search result categories.

14. A method for presenting search results comprising the
steps of:
[...]
in response to a user selection of one or more of the one
or more candidate category identifiers, establishing one
or more final search result categories based upon both
the one or more candidate search result categories and
the user selection; and
presenting one or more final category identifiers
corresponding to the one or more final search result
categories.

15. A method for presenting search results on a user
interface, the method comprising the steps of:
displaying on the user interface one or more user interface
objects corresponding to the one or more search result
categories that have been dynamically established
based upon attributes of the search results; and
displaying on the user interface one or more user interface
objects corresponding to one or more static categories.

16. The method as recited in claim 15, further comprising
the step of responding to a user selection of a particular user
interface object from the one or more user interface objects
by displaying on the user interface search results associated
with a particular search result category corresponding to the
particular user interface object.

17. The method as recited in claim 15, further comprising
the step of in response to a first user selection of a first user
interface object from the one or more user interface objects,
displaying on the user interface one or more sub-category
user interface objects corresponding to one or more
sub-categories, wherein the one or more sub-categories are
associated with the category corresponding to the first user
interface object, the one or more sub-categories being
dynamically determined based upon the attributes of the
search results.

18. The method as recited in claim 17, further comprising
the step of in response to a second user selection of the first
user interface object, undisplaying from the user interface
the one or more sub-category user interface objects.

19. The method as recited in claim 17, further comprising
the step of in response to a second user selection of the one
or more sub-category user interface objects, displaying on
the user interface search results associated with the one or
more sub-categories corresponding to the sub-category user
interface objects.

20. The method as recited in claim 19, further comprising
the step of in response to a fourth user selection of the one
or more sub-category user interface objects, undisplaying
from the user interface the search results associated with the
one or more sub-categories corresponding to the
sub-category user interface objects.

21. A computer system for presenting search results to a 
user, the computer system comprising: 

a user interface; 

one or more processors; and 

a memory communicatively coupled to the one or more
processors and containing one or more sequences of
one or more instructions which, when executed by the
one or more processors, cause the computer system to
perform the steps of
receiving search results, 

dynamically establishing one or more search result cat- 
egories based upon attributes of the search results by 
identifying common attributes among the search 
results, 

generating a set of one or more coefficients that reflect 
the similarity or dissimilarity of the search results 
based upon the common attributes, 

grouping the search results based upon the set of one or 
more coefficients, and 

selecting the one or more categories based upon the 
grouping of the search results; and 

displaying on the user interface the one or more cat- 
egory indicators corresponding to the one or more 
search result categories. 

22. The computer system as recited in claim 21, wherein 
every member of the one or more search result categories is 
a data item that satisfies criteria specified in a query that 
produced the search results. 

23. The computer system as recited in claim 21, wherein 
the step of identifying common attributes among the search 
results is performed using Bayesian clustering techniques. 

24. The computer system as recited in claim 21, wherein 
the step of identifying common attributes among the search 
results is performed using a neural network. 

25. The computer system as recited in claim 21, wherein 
the coefficients are Jaccard coefficients, and 

the step of generating a set of one or more coefficients that 
reflect the similarity of the search results based upon 
the common attributes includes the step of 
generating a set of one or more Jaccard coefficients that 

reflect the similarity of the search results based upon 

the common attributes. 

26. The computer system as recited in claim 21, wherein 
the search results are first search results, 

the memory system further comprises instructions for 
performing the step of applying relevance criteria to the 
first search results to generate second search results that 
satisfy the relevance criteria, and 

the step of dynamically establishing one or more search 
result categories based upon attributes of the search 
results includes the step of dynamically establishing 
one or more search result categories based upon 
attributes of the second search results. 

27. The computer system as recited in claim 21, wherein 
the memory further includes instructions for performing the

step of sorting the search results by the attributes of the 
search results to generate sorted search results, and 
the step of dynamically establishing one or more search 
result categories based upon attributes of the search 
results includes the step of dynamically establishing 
one or more search result categories based upon 
attributes of the sorted search results. 
28. The computer system as recited in claim 21, wherein
the search results include a plurality of matching data items
and the method further comprises the step of assigning the
matching data items to the one or more search result
categories.

29. The computer system as recited in claim 21, wherein 
the memory further includes instructions for performing the 
step of in response to a user selection, presenting search 
results associated with the one or more search result
categories.

30. The computer system as recited in claim 21, wherein 
the memory further includes instructions for performing the 
steps of 

dynamically establishing one or more search result
sub-categories based upon both the one of said search result
categories and the search results that belong to said one
of said search result categories, and
presenting one or more sub-category identifiers
corresponding to the one or more search result
sub-categories.

31. The computer system as recited in claim 30, wherein 
the memory further includes instructions for performing the 
step of in response to a user selection, presenting search 
results associated with the one or more sub-categories.

32. A computer system for presenting search results
comprising: 

one or more processors; and 

a memory communicatively coupled to the one or more
processors and containing one or more sequences of
one or more instructions which, when executed by the
one or more processors, cause the one or more
processors to perform the steps of:
receiving search results;
dynamically establishing one or more search result
categories based upon attributes of the search results;
presenting one or more category identifiers
corresponding to the one or more search result categories; and
presenting one or more static category identifiers
corresponding to one or more static search result
categories.

33. The computer system as recited in claim 32, wherein
the memory further includes one or more additional
instructions which, when processed by the one or more processors,
cause the one or more processors to perform the steps of
presenting first search results corresponding to the one or
more search result categories, and
presenting second search results corresponding to the one
or more static search result categories.
34. A computer system for presenting search results
comprising:

one or more processors; and

a memory communicatively coupled to the one or more
processors and containing one or more sequences of
one or more instructions which, when executed by the
one or more processors, cause the one or more processors
to perform the steps of:
receiving search results;
dynamically establishing one or more candidate search
result categories based upon attributes of the search
results;
presenting one or more candidate category identifiers
corresponding to the one or more candidate search
result categories;
in response to a user selection of one or more of the one
or more candidate category identifiers, establishing
one or more final search result categories based upon
both the one or more candidate search result
categories and the user selection; and
presenting one or more final category identifiers
corresponding to the one or more final search result
categories.

35. A computer-readable medium carrying one or more
sequences of one or more instructions for presenting search
results to a user, the one or more sequences of one or more
instructions including instructions which, when executed by
one or more processors, cause the one or more processors to
perform the steps of:
receiving search results,
dynamically establishing one or more search result
categories based upon attributes of the search results by
identifying common attributes among the search
results,
generating a set of one or more coefficients that reflect
the similarity or dissimilarity of the search results
based upon the common attributes,
grouping the search results based upon the set of one or
more coefficients, and
selecting the one or more categories based upon the
grouping of the search results; and
displaying on the user interface one or more category
indicators corresponding to the one or more search
result categories.

36. The computer-readable medium as recited in claim 35,
wherein every member of the one or more search result
categories is a data item that satisfies criteria specified in a
query that produced the search results.

37. The computer-readable medium as recited in claim 35,
wherein the step of identifying common attributes among
the search results is performed using Bayesian clustering
techniques.

38. The computer-readable medium as recited in claim 35,
wherein the step of identifying common attributes among
the search results is performed using a neural network.

39. The computer-readable medium as recited in claim 35,
wherein
the coefficients are Jaccard coefficients, and
the step of generating a set of one or more coefficients that
reflect the similarity of the search results based upon
the common attributes includes the step of
generating a set of one or more Jaccard coefficients that
reflect the similarity of the search results based upon
the common attributes.

40. The computer-readable medium as recited in claim 35,
wherein
the search results are first search results,
the computer-readable medium further includes
instructions for performing the step of applying relevance
criteria to the first search results to generate second
search results that satisfy the relevance criteria, and
the step of dynamically establishing one or more search
result categories based upon attributes of the search
results includes the step of dynamically establishing
one or more search result categories based upon
attributes of the second search results.

41. The computer-readable medium as recited in claim 35,
wherein
the computer-readable medium further includes
instructions for performing the step of sorting the search
results by the attributes of the search results to generate
sorted search results, and
the step of dynamically establishing one or more search
result categories based upon attributes of the search
results includes the step of dynamically establishing
one or more search result categories based upon
attributes of the sorted search results.

42. The computer-readable medium as recited in claim 35,
wherein the search results include a plurality of matching
data items and the method further comprises the step of
assigning the matching data items to the one or more search
result categories.

43. The computer-readable medium as recited in claim 35,
wherein the computer-readable medium further includes
instructions for performing the step of in response to a user
selection, presenting search results associated with the one
or more search result categories.

44. The computer-readable medium as recited in claim 35,
further including instructions for performing the steps of
dynamically establishing one or more search result
sub-categories based upon both the one of said search result
categories and the search results that belong to said one
of said search result categories, and
presenting one or more sub-category identifiers
corresponding to the one or more search result
sub-categories.

45. The computer-readable medium as recited in claim 44,
further including instructions for performing the step of in
response to a user selection, presenting search results
associated with the one or more sub-categories.

46. A computer-readable medium for presenting search
results, the computer readable medium carrying one or more
sequences of one or more instructions which, when
processed by one or more processors, cause the one or more
processors to perform the steps of:
receiving search results;
dynamically establishing one or more search result
categories based upon attributes of the search results;
presenting one or more category identifiers corresponding
to the one or more search result categories; and
presenting one or more static category identifiers
corresponding to one or more static search result categories.

47. The computer-readable medium as recited in claim 46,
further including instructions for performing the steps of
presenting first search results corresponding to the one or
more search result categories, and
presenting second search results corresponding to the one
or more static search result categories.

48. A computer-readable medium for presenting search
results, the computer readable medium carrying one or more
sequences of one or more instructions which, when
processed by one or more processors, cause the one or more
processors to perform the steps of:
receiving search results;
dynamically establishing one or more candidate search
result categories based upon attributes of the search
results;
presenting one or more candidate category identifiers
corresponding to the one or more search result
categories; and
in response to a user selection of one or more of the one
or more candidate category identifiers, establishing one
or more final search result categories based upon both
the one or more candidate search result categories and
the user selection; and
presenting one or more final category identifiers
corresponding to the one or more final search result
categories.

* * * * *



10/08/2003, EAST Version: 1.04.0000 



